

Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning

Neural Information Processing Systems

The Maximum Entropy (Max-Ent) framework has been effectively employed in a variety of Reinforcement Learning (RL) tasks. In this paper, we first propose a novel Max-Ent framework for policy evaluation in a distributional RL setting, named Distributional Maximum Entropy Policy Evaluation (D-Max-Ent PE). We derive a generalization-error bound that depends on the complexity of the representation employed, showing that this framework can explicitly take into account the features used to represent the state space while evaluating a policy. Then, we exploit these favorable properties to drive the representation learning of the state space in a Structural Risk Minimization fashion. We employ state-aggregation functions as feature functions and we specialize the D-Max-Ent approach into an algorithm, named D-Max-Ent Progressive Factorization, which constructs a progressively finer-grained representation of the state space by balancing the trade-off between preserving information (bias) and reducing the effective number of states, i.e., the complexity of the representation space (variance). Finally, we report the results of some illustrative numerical simulations, showing that the proposed algorithm matches the expected theoretical behavior and highlighting the relationship between aggregations and sample regimes.
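The state-aggregation idea behind the abstract can be sketched in a few lines: an aggregation maps states to groups, the indicator features of those groups serve as feature functions, and "progressive factorization" corresponds to splitting a group into finer ones. This is an illustrative toy, with hypothetical names, not the paper's implementation:

```python
import numpy as np

def aggregation_features(assignment, n_groups):
    """One-hot indicator features phi(s): each state maps to its group."""
    def phi(state):
        v = np.zeros(n_groups)
        v[assignment[state]] = 1.0
        return v
    return phi

def split_group(assignment, group, states_to_move, new_group):
    """Refine the partition: move selected states out of `group` into a new group,
    trading a larger representation (variance) for less information loss (bias)."""
    refined = dict(assignment)
    for s in states_to_move:
        assert refined[s] == group
        refined[s] = new_group
    return refined

# 4 states aggregated into 2 groups, then progressively refined into 3
coarse = {0: 0, 1: 0, 2: 1, 3: 1}
phi = aggregation_features(coarse, 2)
finer = split_group(coarse, group=1, states_to_move=[3], new_group=2)
```

Each refinement step increases the effective number of states by one, which is the bias/variance knob the abstract describes.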


Deriving the Scaled-Dot-Function via Maximum Likelihood Estimation and Maximum Entropy Approach

Ma, Jiyong

arXiv.org Artificial Intelligence

In this paper, we present a maximum likelihood estimation approach to determine the value vector in transformer models. We model the sequence of value vectors, key vectors, and the query vector as a sequence of Gaussian distributions. The variance in each Gaussian distribution depends on the time step, the corresponding key vector, and the query vector. The mean value in each Gaussian distribution depends on the time step and the corresponding value vector. This analysis may offer a new explanation of the scaled-dot-product function or softmax function used in transformer architectures [1]. Another explanation, inspired by [4], is based on the maximum entropy approach in natural language processing [5]. In this approach, a query vector and key vectors are used to derive the feature functions for the maximum entropy model.
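The scaled-dot-product function the abstract seeks to explain is standard; a minimal numpy sketch for a single query vector:

```python
import numpy as np

def scaled_dot_product_attention(q, K, V):
    """Compute softmax(K q / sqrt(d)) @ V for one query vector q.

    The 1/sqrt(d) scaling is exactly the quantity the maximum-likelihood and
    maximum-entropy derivations discussed in the abstract aim to justify.
    """
    d = q.shape[-1]
    scores = K @ q / np.sqrt(d)           # one score per key
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()              # softmax over keys
    return weights @ V                    # convex combination of value vectors
```

With a zero query, all scores are equal, so the output is the plain mean of the value vectors; the key-dependent variances in the paper's Gaussian model modulate exactly these weights.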


A Maximum Entropy Approach to Collaborative Filtering in Dynamic, Sparse, High-Dimensional Domains

Pavlov, Dmitry Y., Pennock, David M.

Neural Information Processing Systems

We develop a maximum entropy (maxent) approach to generating recommendations in the context of a user's current navigation stream, suitable for environments where data is sparse, high-dimensional, and dynamic -- conditions typical of many recommendation applications. We address sparsity and dimensionality reduction by first clustering items based on user access patterns so as to attempt to minimize the a priori probability that recommendations will cross cluster boundaries and then recommending only within clusters. We address the inherent dynamic nature of the problem by explicitly modeling the data as a time series; we show how this representational expressivity fits naturally into a maxent framework. We conduct experiments on data from ResearchIndex, a popular online repository of over 470,000 computer science documents. We show that our maxent formulation outperforms several competing algorithms in offline tests simulating the recommendation of documents to ResearchIndex users.
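The cluster-then-recommend-within-cluster step can be illustrated with a toy 2-means on item access patterns. This is a simplified stand-in (farthest-point initialization, popularity-based pick), not the authors' clustering or maxent model:

```python
import numpy as np

def cluster_items(access, n_iters=20):
    """Toy 2-means over item rows of a binary (items x users) access matrix.

    Items co-accessed by similar users land in the same cluster; the second
    center is initialized at the item farthest from the first (deterministic).
    """
    a = access.astype(float)
    c0 = a[0]
    c1 = a[np.argmax(((a - c0) ** 2).sum(-1))]
    centers = np.stack([c0, c1])
    for _ in range(n_iters):
        d = ((a[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d.argmin(axis=1)
        for k in range(2):
            if (labels == k).any():
                centers[k] = a[labels == k].mean(axis=0)
    return labels

def recommend_within_cluster(labels, current_item, popularity):
    """Recommend the most popular other item in the current item's cluster."""
    k = labels[current_item]
    candidates = np.where(labels == k)[0]
    candidates = candidates[candidates != current_item]
    return int(candidates[np.argmax(popularity[candidates])])

# two obvious co-access groups: items {0,1} and items {2,3}
access = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 1, 1],
                   [0, 0, 1, 1]])
labels = cluster_items(access)
```

Restricting candidates to one cluster is what keeps the subsequent maxent model low-dimensional, per the sparsity argument in the abstract.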


A Maximum Entropy approach to Massive Graph Spectra

Granziol, Diego, Ru, Robin, Zohren, Stefan, Dong, Xiaowen, Osborne, Michael, Roberts, Stephen

arXiv.org Machine Learning

Graph spectral techniques for measuring graph similarity, or for learning the cluster number, require kernel smoothing. The choice of kernel function and bandwidth is typically made in an ad-hoc manner and heavily affects the resulting output. We prove that kernel smoothing biases the moments of the spectral density. We propose an information-theoretically optimal approach to learn a smooth graph spectral density which fully respects the moment information. Our method's computational cost is linear in the number of edges, and hence it can be applied to large networks with millions of nodes. We apply our method to the problems of graph similarity and cluster-number learning, where we outperform comparable iterative spectral approaches on synthetic and real graphs.
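The moment-respecting maximum-entropy density has a standard exponential-family form: on a discretized eigenvalue support, p_i ∝ exp(Σ_k α_k λ_i^k), with the multipliers α found by minimizing the convex dual log Z(α) − α·m, whose gradient is exactly the moment mismatch. A minimal sketch of that fit (an illustrative reimplementation under these assumptions, not the authors' code):

```python
import numpy as np
from scipy.optimize import minimize

def maxent_spectral_density(grid, moments):
    """Discrete max-entropy density on `grid` matching E[lambda^k] = moments[k-1].

    p_i is proportional to exp(sum_k alpha_k * grid_i**k); the dual
    log Z(alpha) - alpha . moments is convex and is minimized over alpha.
    """
    powers = np.stack([grid ** (k + 1) for k in range(len(moments))], axis=1)

    def dual(alpha):
        logits = powers @ alpha
        m = logits.max()
        return m + np.log(np.exp(logits - m).sum()) - alpha @ moments

    res = minimize(dual, np.zeros(len(moments)), method="BFGS")
    logits = powers @ res.x
    p = np.exp(logits - logits.max())
    return p / p.sum()
```

In the large-graph setting of the abstract, the input moments would come from stochastic trace estimation at cost linear in the number of edges; here they are simply given.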


Learning partially ranked data based on graph regularization

Nakamura, Kento, Yano, Keisuke, Komaki, Fumiyasu

arXiv.org Machine Learning

Ranked data appear in many different applications, including voting and consumer surveys. In practice, data are often only partially ranked, and partially ranked data can be viewed as missing data. This paper addresses parameter estimation for partially ranked data under a (possibly) non-ignorable missing mechanism. We propose estimators for both complete rankings and missing mechanisms, together with a simple estimation procedure that combines a graph regularization with the Expectation-Maximization algorithm. The procedure is theoretically guaranteed to converge. Allowing a non-ignorable missing mechanism reduces modeling bias, while the graph regularization avoids the complexity inherent in such mechanisms. The experimental results demonstrate that the proposed estimators work well under non-ignorable missing mechanisms.
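The "partial rankings as missing data" view can be made concrete with a toy sketch (not the paper's estimator): a partial ranking is compatible with every complete ranking that extends it, and an EM-style E-step averages over these completions.

```python
from itertools import permutations

def compatible_completions(partial, items):
    """All complete rankings (permutations of `items`) whose relative order
    of the observed items agrees with the partial ranking `partial`.

    E.g. observing 'a ranked above b' over items {a, b, c} leaves the
    position of 'c' as missing data to be averaged over in an E-step.
    """
    observed = list(partial)
    out = []
    for perm in permutations(items):
        order = [x for x in perm if x in observed]
        if order == observed:
            out.append(perm)
    return out
```

Enumerating completions is only feasible for tiny item sets; the paper's graph regularization is precisely what keeps estimation tractable without such enumeration.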

